Add an API for generating quotes at compile-time using macros #35

Open: smarter wants to merge 7 commits into main from the macro branch
Conversation

@smarter (Collaborator) commented on Aug 12, 2023

Because the carac DSL is implemented in Scala, it is possible to dynamically compose programs based on information only known at runtime. Most of the time, however, this level of dynamism is not necessary: the program is known at compile-time and only the facts need to be loaded at runtime.

We can take advantage of this by leveraging the existing quote backend to generate programs at compile-time using the standard Scala macro mechanism. The result is faster than both the interpreter and the lambda backend on at least one simple benchmark:

Benchmark                      Mode   Cnt   Score   Error  Units
BenchMacro.simple_interpreter  thrpt   10  28,511 ± 1,442  ops/s
BenchMacro.simple_lambda       thrpt   10  27,863 ± 0,397  ops/s
BenchMacro.simple_macro        thrpt   10  31,917 ± 0,334  ops/s

It'd be interesting to try extending this system to generate code that can still be re-optimized at runtime by the JIT.
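To make the idea concrete, here is a hand-written sketch of the staging technique (not the API added by this PR; all names are illustrative): a macro unfolds a statically known rule set, here a single transitive-closure rule, into an ordinary fixpoint loop, leaving only the facts as a runtime parameter.

    // Sketch of compile-time staging for the rule path(x, z) :- path(x, y), edge(y, z).
    // Illustrative only; the carac quote backend generates richer code than this.
    import scala.quoted.*

    object TransitiveClosure:
      // The macro expands at compile-time; the facts remain a runtime value.
      inline def solve(facts: Set[(Int, Int)]): Set[(Int, Int)] =
        ${ solveImpl('facts) }

      private def solveImpl(facts: Expr[Set[(Int, Int)]])(using Quotes): Expr[Set[(Int, Int)]] =
        // The quoted block is what ends up in the caller: a fixpoint loop
        // specialized to the rule above, with no interpretation overhead left.
        '{
          val edge = $facts
          var path = edge
          var changed = true
          while changed do
            val derived =
              for
                (x, y1) <- path
                (y2, z) <- edge
                if y1 == y2
              yield (x, z)
            val next = path ++ derived
            changed = next.size != path.size
            path = next
          path
        }

As with any Scala 3 macro, this would live in its own compilation unit; a caller such as `TransitiveClosure.solve(Set((1, 2), (2, 3)))` then evaluates to `Set((1, 2), (2, 3), (1, 3))`.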

smarter and others added 6 commits August 25, 2023 19:54
We still need a small workaround (the macro call needs an explicit `this.`
prefix; see the sketch after the numbers below), but we no longer need to use
a different package.
Before:
    Benchmark                          Mode  Cnt  Score   Error  Units
    BenchMacro.ackermann_opt_macro    thrpt   10  1,812 ± 0,019  ops/s
    BenchMacro.ackermann_worst_macro  thrpt   10  0,097 ± 0,002  ops/s
After:
    BenchMacro.ackermann_opt_macro    thrpt   10  1,727 ± 0,042  ops/s
    BenchMacro.ackermann_worst_macro  thrpt   10  1,294 ± 0,036  ops/s
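The commit does not show the call site, but schematically the shape of the workaround might look like this (all names hypothetical; only the `this.` prefix detail comes from the commit message above):

    import scala.quoted.*

    // Hypothetical names throughout; a stand-in for the real macro implementation.
    object StagedSolverImpl:
      def solveImpl(using Quotes): Expr[Unit] = '{ () }

    trait StagedSolver:
      inline def solveCompiled(): Unit = ${ StagedSolverImpl.solveImpl }

    class Engine extends StagedSolver:
      def run(): Unit =
        // The explicit `this.` prefix is the workaround; without it the
        // macro previously had to live in a different package.
        this.solveCompiled()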
@smarter force-pushed the macro branch 5 times, most recently from a62cda6 to 526c32a on August 26, 2023 at 14:34
Compute the sort order at runtime and use it directly, without an extra
compilation step: instead of re-ordering the code we generate, we re-order the
data that the generated code operates on (see the sketch after the benchmark
numbers below).

Benchmarks on my unplugged laptop:

Before:
    Benchmark                          Mode   Cnt  Score   Error  Units
    BenchMacro.ackermann_opt_macro     thrpt   10  1,026 ± 0,036  ops/s
    BenchMacro.ackermann_worst_macro   thrpt   10  0,773 ± 0,022  ops/s

After:
    BenchMacro.ackermann_opt_macro     thrpt   10  2,386 ± 0,128  ops/s
    BenchMacro.ackermann_worst_macro   thrpt   10  2,548 ± 0,144  ops/s

This is comparable to results on the lambda backend:
    BenchMacro.ackermann_opt_lambda    thrpt   10  2,556 ± 0,075  ops/s
    BenchMacro.ackermann_worst_lambda  thrpt   10  2,636 ± 0,093  ops/s
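In a hand-written sketch (illustrative names, not the code this commit generates), the difference is that the generated loop stays fixed while the visit order over relations becomes a plain runtime value:

    object SortOrderSketch:
      // Stand-in for the runtime ordering decision, e.g. smallest relation first.
      def computeSortOrder(sizes: Array[Int]): Array[Int] =
        sizes.indices.toArray.sortBy(i => sizes(i))

      // The shape of this loop is what the macro emits once, at compile-time.
      def run(relations: Array[String], sizes: Array[Int]): Unit =
        val order = computeSortOrder(sizes) // computed at runtime, no recompilation
        var i = 0
        while i < order.length do
          println(s"visiting ${relations(order(i))}")
          i += 1

    @main def demo(): Unit =
      SortOrderSketch.run(Array("edge", "path", "delta"), Array(100, 10000, 10))

Reordering the data instead of the code avoids re-generating and re-compiling code whenever the sort order changes.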